Lab4_R4L

Team’s Part

Overall Questions

The first question we want to answer as a team is whether mothers who smoke are more likely to have a premature birth than mothers who do not smoke. This question matters because premature birth carries many health risks for the baby, so it is useful to know whether smoking makes it more likely. The Surgeon General’s 1989 report states: “Mothers who smoke have increased rates of premature delivery (before 270 days).” We will make a plot to check visually whether our data support this warning.

The second question we want to answer as a team is whether the babies of mothers who smoke have lower birth weights than the babies of mothers who do not smoke. The Surgeon General’s 1989 report says: “The newborns of mothers who smoke have smaller birth weights at every gestational age (number of days into pregnancy when child is born).” We will check whether this holds in our data set.

Team’s Plots

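# Load the packages and data used by every plot below
library(tidyverse)
babies <- read_csv("https://raw.githubusercontent.com/ervance1/Sp2018-Data-Science-repo/master/babies2a.dat")
babies <- rename(babies, bwtoz = `bwt/oz`)

babies <- na.omit(babies)
ggplot(data = babies, aes(x = Premature, fill = as.factor(smoke))) +
  geom_bar(position = "dodge") +
  scale_fill_discrete(name = "Mother Smokes",
                      labels = c('No', 'Yes')) +
  ggtitle('Team Plot No.1')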

babies <- na.omit(babies)
ggplot(data = babies) +
  geom_smooth(aes(x = gestation, y = bwtoz, color = as.factor(smoke)), se = FALSE) +
  scale_color_discrete(name = 'Does Mother Smoke',
                       labels = c('No', 'Yes')) +
  ggtitle('Team Plot No.2')
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Conclusion

Question 1

The graph above compares the number of babies born premature with the number not born premature, using different colored bars for smokers and non-smokers. Among babies who were not born premature, most have mothers who do not smoke. Among babies who were born premature, more have mothers who smoke than mothers who do not. This is evidence that the Surgeon General’s report is accurate: mothers who smoke are more likely to have a premature birth than mothers who do not smoke.
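Because bar heights are counts, the comparison is easier to read as a rate within each smoking group. The following is a minimal sketch of that calculation in base R. It uses a small invented data frame named `toy` rather than the lab's `babies` data, and it assumes `smoke` and `Premature` are coded 0 for no and 1 for yes.

```r
# Hypothetical counts, invented for illustration; not taken from the lab data.
toy <- data.frame(
  smoke     = c(rep(0, 10), rep(1, 10)),
  Premature = c(rep(0, 8), rep(1, 2), rep(0, 6), rep(1, 4))
)

# Raw counts: rows are smoking status, columns are prematurity.
counts <- table(smoke = toy$smoke, premature = toy$Premature)

# Row proportions: share of premature births within each smoking group.
rates <- prop.table(counts, margin = 1)

print(counts)
print(round(rates, 2))
```

Running the same two calls on the real data would show whether the premature share is higher among smokers even when the two groups differ in size.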

Question 2

The plot above shows two smoothed trend lines of birth weight in ounces against gestational age, one for mothers who smoke and one for mothers who do not. For most gestational ages, babies of mothers who smoke weigh less than babies of mothers who do not smoke. The trend appears to reverse shortly after a gestational age of about 300 days, but there are far fewer data points beyond 300 days, so we give that part of the curve little weight in our analysis.
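The claim "at every gestational age" can also be checked numerically by comparing mean birth weights inside gestation bins. The sketch below is one way to do that in base R. The data frame `toy`, its values, and the 10-day bin width are all assumptions made for illustration, not part of the lab data.

```r
# Hypothetical toy data, invented for illustration.
toy <- data.frame(
  gestation = c(262, 268, 275, 281, 288, 295, 263, 269, 276, 283, 289, 296),
  bwtoz     = c(105, 110, 118, 122, 126, 130,  98, 104, 111, 115, 120, 124),
  smoke     = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
)

# Group gestation into 10-day bins so each bin compares similar gestational ages.
toy$gest_bin <- cut(toy$gestation, breaks = seq(260, 300, by = 10))

# Mean birth weight for every combination of bin and smoking status.
means <- tapply(toy$bwtoz, list(toy$gest_bin, toy$smoke), mean)

# Non-smoker mean minus smoker mean; positive values mean smokers' babies are lighter.
gap <- means[, "0"] - means[, "1"]
print(round(gap, 1))
```

Bins with very few babies, such as those past 300 days in the real data, would give unstable means, which matches the caution about the right-hand end of the plot.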

Recommendation

Based on these results, we concur with the Surgeon General’s warning that mothers should not smoke during pregnancy. Both key findings of the Surgeon General’s 1989 report were supported by the data set we used, so we recommend that pregnant women avoid smoking.

Plots for Preliminary Question

babies <- na.omit(babies)
# Education level by smoking status
ggplot(data = babies, aes(x = med, fill = as.factor(smoke))) +
  geom_bar(position = 'dodge') +
  scale_fill_discrete(name = "Mother Smokes",
                      labels = c('No', 'Yes')) +
  ggtitle('Education vs. Smoking Habits')

# Income bracket by smoking status
ggplot(data = babies, aes(x = inc, fill = as.factor(smoke))) +
  geom_bar(position = 'dodge') +
  scale_fill_discrete(name = "Mother Smokes",
                      labels = c('No', 'Yes')) +
  ggtitle('Income vs. Smoking Habits')

Preliminary Question

Two variables that we found to have different distributions for smokers and non-smokers are income and education. The two bar plots above show the distributions of education and income, with different colored bars for smokers and non-smokers, and the distributions are indeed somewhat different. Smokers tend to have less education and lower incomes than non-smokers. Education and income are therefore possible confounding variables and should be kept in mind when interpreting the team results.

Individual’s Part

Yen-Ting Shen

I, Yen-Ting Shen, made a bar chart that shows the relationship among the mother’s age at termination of pregnancy, the mother’s education background, and the mother’s race. I wanted to answer whether the mother’s education background affects the age at which she terminates the pregnancy, and which race terminates the pregnancy most often. According to my plot, mothers who have only a high school degree and no further schooling are most likely to terminate their pregnancy between the ages of 20 and 25. In addition, most of the mothers who terminate their pregnancy are White or Asian. I created this plot with the geom_bar function and x = mage. Because the values 0 through 5 of the “mrace” variable all denote the same category, our TA helped me create another column named “mrace2” that combines 0 through 5 and sets them to 1. I then used fill = as.factor(mrace2) inside aes to make the plot more informative. The result surprised me, because I expected that the more education a mother has, the less likely she is to terminate her pregnancy. However, higher education does not stop mothers from terminating their pregnancy; it only affects the age at which they do so.

babies <-na.omit(babies)
babies <- babies %>%
  mutate(mrace2 = ifelse(mrace <= 5, 1, mrace))

ggplot(data=babies)+
  geom_bar(aes(x=mage, fill=as.factor(mrace2)))+
  facet_grid(med~.)+
  scale_fill_discrete(name = "Women's Race",
                      labels = c('White', 'Black', 'Asian', 'Mix', 'Unknown'))+
  xlab('Mothers Age Termination Pregnancy')+
  ggtitle('Mothers Age of termination, Education Background, and Race')

Emery Schattinger

The question I wanted to answer was whether smoking during pregnancy affects a baby’s weight at birth. My initial expectation was that a mother who smokes would have a lighter baby. To answer this question, I created a bar graph that shows the relationship between whether women smoked while pregnant and the weight of their babies at birth. I put the babies’ weight in ounces on the x-axis and the number of babies on the y-axis. I then created a legend that distinguishes the babies of smoking mothers, shown in blue, from the babies of non-smoking mothers, shown in red. When making this plot, I changed the legend title and the group names inside the legend to make them clearer, and I renamed the x- and y-axes to show exactly what the data are. I chose the fill aesthetic instead of color because it makes the plot much easier to read. The plot shows that babies of mothers who did not smoke weighed, on average, noticeably more than babies of mothers who smoked. Therefore, if a mother smokes while pregnant, her baby will tend to weigh less, and may be less healthy, than the baby of a non-smoking mother.

babies <- na.omit(babies)
ggplot(data = babies) +
  geom_bar(mapping = aes(x = bwtoz, fill = as.factor(smoke)), position = "dodge") +
  ggtitle("How Smoking Can Affect a Baby's Body Weight") +
  labs(y = "Babies", x = "Baby's Weight (oz)") +
  scale_fill_discrete(name = 'Mother Smokes When Pregnant',
                      labels = c('No', 'Yes'))

Jason Giblin

I, Jason Giblin, made a bar chart detailing the relationship between the number of previous pregnancies a woman had and her income. I wanted to know whether the two are correlated. I suspected a negative correlation, meaning that women who had more children would have lower incomes. I used geom_bar with x set to the variable “parity,” which in this data set is the number of previous pregnancies a woman had. I then mapped the fill of the bars to her income, which in this data set is an ordered categorical variable with values from 0 to 9, where each higher number indicates a higher income bracket. I set position = ‘dodge’ to place the bars for different income brackets side by side and make the chart more readable. Finally, I added axis labels and a legend to make the graph easy to understand. The graph shows no apparent correlation between a woman’s income and her number of previous pregnancies, which was contrary to my original expectation. I confirmed this by computing the correlation between income and parity, which is very small, about 0.003.

babies<-na.omit(babies)
ggplot(data = babies, aes(x = as.factor(parity), fill =as.factor(inc)))+
  geom_bar(position = 'dodge')+
  scale_fill_discrete(name = 'Income',
                      labels = c('Under $2500', '$2500-$4999', '$5000-$9999', '$10000-$19999', '$20000-$29999', '$30000-$49999', '$50000-$74999', '$75000-$99999', '$100000-$149999', '$150000+'))+
  xlab('Number of Previous Pregnancies')

 cor(babies$parity, babies$inc)
## [1] 0.003267679
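Since income is an ordered bracket code rather than a measured amount, a rank-based correlation is a reasonable cross-check on the Pearson value above. The sketch below is not part of the original analysis. It uses an invented data frame `toy`, and it also shows why `cor()` returns `NA` when missing values have not been removed first.

```r
# Hypothetical toy data, not the lab's `babies` data set.
toy <- data.frame(
  parity = c(0, 1, 2, 3, 1, 0, 4, 2, NA, 1),
  inc    = c(3, 5, 2, 7, 1, 4, 6, NA, 2, 3)
)

# With missing values present, the default settings return NA.
pearson_raw <- cor(toy$parity, toy$inc)

# Dropping incomplete pairs gives a usable estimate.
pearson <- cor(toy$parity, toy$inc, use = "complete.obs")

# Spearman works on ranks, which suits an ordered category such as an income bracket.
spearman <- cor(toy$parity, toy$inc, method = "spearman", use = "complete.obs")

print(c(pearson = pearson, spearman = spearman))
```

If the Spearman value on the real data is also close to zero, that would support the conclusion that parity and income bracket are essentially unrelated here.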

Baiyu Chen

The density of the points reflects the education levels of the two groups. The plot shows that mothers who never smoked more often reached higher education levels than mothers who smoke: the never-smoke panel has denser points at levels 4 and 5. In addition, most smokers have lower rather than higher education levels, with denser points at levels 1 and 2 than at the other levels. To build the plot, I used geom_jitter, which makes the density of points at each level easy to see. I also changed the color of the points and replaced the default labels on each facet and on the plot with my own.

require(ggplot2) 
library(dplyr) 
 
babies <-na.omit(babies) 
 
babies <- babies %>% 
  mutate(med2 = ifelse(med==7 , 6, med)) 
 
ggplot(data = babies)+ 
  geom_jitter(mapping = aes(x=med2, y=mage, color=as.factor(med2)))+ 
  facet_wrap(~smoke,labeller=labeller(smoke = c('0' = 'Never Smoke', '1' = 'Smoke Now')))+ 
  scale_color_discrete(name = 'Mothers education',
                       labels = c('0 = less than 8th grade',
                                  '1 = 8th to 12th grade. did not graduate high school',
                                  '2 = high school graduate, no other schooling',
                                  '3 = high school graduate + trade school',
                                  '4 = high school graduate + some college',
                                  '5 = college graduate',
                                  '6 = trade school but unclear if graduated from high school'))+
  scale_x_continuous(breaks=c(0,1,2,3,4,5,6),labels = c('0','1','2','3','4','5','6'))+ 
  theme_grey()+ 
  xlab("Mothers Education")+ 
  ylab("Mothers Age")+ 
  ggtitle("Educated Level Between Smoker and Non-smoker") 

Tiger Su

The following plot shows the relationships among a mother’s education, her age, her number of previous pregnancies, and the number of cigarettes she smokes per day. The facet_wrap function separates the mothers into sub-plots by education level. I used color = as.factor() to distinguish the number of cigarettes smoked per day, which shows that about half of the mothers in the data have smoked, and that the lower education levels (high school graduate or lower) have a higher share of smokers. The geom_point layer also shows that the number of previous pregnancies increases with age, whereas the number of cigarettes smoked per day goes the opposite way. I used geom_smooth to indicate the general pattern of the data. Finally, for aesthetic purposes, I moved the legend for the number of cigarettes per day to the bottom of the plot with theme(legend.position = "bottom").

babies <- na.omit(babies)

ggplot(data = babies) +
  ggtitle('Amount of Cigs vs. Mothers Education, Age, Previous Pregnancies') +
  geom_point(aes(x = mage, y = parity, color = as.factor(number))) +
  facet_wrap(~ med, nrow = 2,
             labeller = labeller(med = c('0' = 'Less Than 8th Grade',
                                         '1' = '8th-12th Grade',
                                         '2' = 'High School Graduate',
                                         '3' = 'Trade School',
                                         '4' = 'Some College',
                                         '5' = 'College Graduate',
                                         '7' = 'Trade School Unknown High School'))) +
  geom_smooth(aes(x = mage, y = parity), se = FALSE) +
  scale_color_discrete(name = 'Numbers of Cigs in A Day',
                       labels = c('Never Smoked', '1-4', '5-9', '10-14', '15-19',
                                  '20-29', '30-39', '40-60', '60+')) +
  xlab("Mothers Age") +
  ylab("Numbers of Previous Pregnancies") +
  theme(legend.position = "bottom")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Who did what?

Yen-Ting Shen: I created a bar plot, formatted the R Markdown file, booked the study room for our team’s meeting, and created the first team plot with Jason.

Jason Giblin: In addition to my individual work, I contributed heavily to the team questions. I created the graph for the second team question and helped Yen-Ting create the plot for the first team question. I also wrote the conclusions for both team questions and the recommendation.

Emery Schattinger: For my individual work, I created a bar plot. I also helped generate ideas for the team plots.

Baiyu Chen: For my individual work, I made a jitter plot. In addition, I helped the team figure out how to label the facets.

Tiger Su: To be honest, I did not help much with this week’s team plots because my laptop was stolen. I went to one of the group meetings, finished my individual part, and discussed the team part a little.
